Search Results: "mattia"

23 September 2022

Reproducible Builds (diffoscope): diffoscope 222 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 222. This version includes the following changes:
[ Mattia Rizzolo ]
* Use pep517 and pip to load the requirements. (Closes: #1020091)
* Remove old Breaks/Replaces in debian/control that have been obsoleted since
  bullseye
You find out more by visiting the project homepage.

9 September 2022

Reproducible Builds: Reproducible Builds in August 2022

Welcome to the August 2022 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Community news As announced last month, registration is currently open for our in-person summit this year which is due to be held between November 1st November 3rd. The event will take place in Venice (Italy). Very soon we intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees. Please see the announcement email for information about how to register.
The US National Security Agency (NSA), Cybersecurity and Infrastructure Security Agency (CISA) and the Office of the Director of National Intelligence (ODNI) have released a document called Securing the Software Supply Chain: Recommended Practices Guide for Developers (PDF) as part of their Enduring Security Framework (ESF) work. The document expressly recommends having reproducible builds as part of advanced recommended mitigations, along with hermetic builds. Page 31 (page 35 in the PDF) says:
Reproducible builds provide additional protection and validation against attempts to compromise build systems. They ensure the binary products of each build system match: i.e., they are built from the same source, regardless of variable metadata such as the order of input files, timestamps, locales, and paths. Reproducible builds are those where re-running the build steps with identical input artifacts results in bit-for-bit identical output. Builds that cannot meet this must provide a justification why the build cannot be made reproducible.
The full press release is available online.
On our mailing list this month, Marc Prud hommeaux posted a feature request for diffoscope which additionally outlines a project called The App Fair, an autonomous distribution network of free and open-source macOS and iOS applications, where validated apps are then signed and submitted for publication .
Author/blogger Cory Doctorow posted published a provocative blog post this month titled Your computer is tormented by a wicked god . Touching on Ken Thompson s famous talk, Reflections on Trusting Trust , the early goals of Secure Computing and UEFI firmware interfaces:
This is the core of a two-decade-old debate among security people, and it s one that the benevolent God faction has consistently had the upper hand in. They re the curated computing advocates who insist that preventing you from choosing an alternative app store or side-loading a program is for your own good because if it s possible for you to override the manufacturer s wishes, then malicious software may impersonate you to do so, or you might be tricked into doing so. [..] This benevolent dictatorship model only works so long as the dictator is both perfectly benevolent and perfectly competent. We know the dictators aren t always benevolent. [ ] But even if you trust a dictator s benevolence, you can t trust in their perfection. Everyone makes mistakes. Benevolent dictator computing works well, but fails badly. Designing a computer that intentionally can t be fully controlled by its owner is a nightmare, because that is a computer that, once compromised, can attack its owner with impunity.

Lastly, Chengyu HAN updated the Reproducible Builds website to correct an incorrect Git command. [ ]

Debian In Debian this month, the essential and required package sets became 100% reproducible in Debian bookworm on the amd64 and arm64 architectures. These two subsets of the full Debian archive refer to Debian package priority levels as described in the 2.5 Priorities section of the Debian Policy there is no canonical minimal installation package set in Debian due to its diverse methods of installation. As it happens, these package sets are not reproducible on the i386 architecture because the ncurses package on that architecture is not yet reproducible, and the sed package currently fails to build from source on armhf too. The full list of reproducible packages within these package sets can be viewed within our QA system, such as on the page of required packages in amd64 and the list of essential packages on arm64, both for Debian bullseye.
It recently has become very easy to install reproducible Debian Docker containers using podman on Debian bullseye:
$ sudo apt install podman
$ podman run --rm -it debian:bullseye bash
The (pre-built) image used is itself built using debuerrotype, as explained on docker.debian.net. This page also details how to build the image yourself and what checksums are expected if you do so.
Related to this, it has also become straightforward to reproducibly bootstrap Debian using mmdebstrap, a replacement for the usual debootstrap tool to create Debian root filesystems:
$ SOURCE_DATE_EPOCH=$(date --utc --date=2022-08-29 +%s) mmdebstrap unstable > unstable.tar
This works for (at least) Debian unstable, bullseye and bookworm, and is tested automatically by a number of QA jobs set up by Holger Levsen (unstable, bookworm and bullseye)
Work has also taken place to ensure that the canonical debootstrap and cdebootstrap tools are also capable of bootstrapping Debian reproducibly, although it currently requires a few extra steps:
  1. Clamping the modification time of files that are newer than $SOURCE_DATE_EPOCH to be not greater than SOURCE_DATE_EPOCH.
  2. Deleting a few files. For debootstrap, this requires the deletion of /etc/machine-id, /var/cache/ldconfig/aux-cache, /var/log/dpkg.log, /var/log/alternatives.log and /var/log/bootstrap.log, and for cdebootstrap we also need to delete the /var/log/apt/history.log and /var/log/apt/term.log files as well.
This process works at least for unstable, bullseye and bookworm and is now being tested automatically by a number of QA jobs setup by Holger Levsen [ ][ ][ ][ ][ ][ ]. As part of this work, Holger filed two bugs to request a better initialisation of the /etc/machine-id file in both debootstrap [ ] and cdebootstrap [ ].
Elsewhere in Debian, 131 reviews of Debian packages were added, 20 were updated and 27 were removed this month, adding to our extensive knowledge about identified issues. Chris Lamb added a number of issue types, including: randomness_in_browserify_output [ ], haskell_abi_hash_differences [ ], nondeterministic_ids_in_html_output_generated_by_python_sphinx_panels [ ]. Lastly, Mattia Rizzolo removed the deterministic flag from the captures_kernel_variant flag [ ].

Other distributions Vagrant Cascadian posted an update of the status of Reproducible Builds in GNU Guix, writing that:
Ignoring the pesky unknown packages, it is more like ~93% reproducible and ~7% unreproducible... that feels a bit better to me! These numbers wander around over time, mostly due to packages moving back into an "unknown" state while the build farms catch up with each other... although the above numbers seem to have been pretty consistent over the last few days.
The post itself contains a lot more details, including a brief discussion of tooling. Elsewhere in GNU Guix, however, Vagrant updated a number of packages such as itpp [ ], perl-class-methodmaker [ ], libnet [ ], directfb [ ] and mm-common [ ], as well as updated the version of reprotest to 0.7.21 [ ]. In openSUSE, Bernhard M. Wiedemann published his usual openSUSE monthly report.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 220 and 221 to Debian, as well as made the following changes:
  • Update external_tools.py to reflect changes to xxd and the vim-common package. [ ]
  • Depend on the dedicated xxd package now, not the vim-common package. [ ]
  • Don t crash if we can open a PDF file using the PyPDF library, but cannot subsequently parse the annotations within. [ ]
In addition, Vagrant Cascadian updated diffoscope in GNU Guix, first to to version 220 [ ] and later to 221 [ ].

Community news The Reproducible Builds project aims to fix as many currently-unreproducible packages as possible as well as to send all of our patches upstream wherever appropriate. This month we created a number of patches, including:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, Holger Levsen made the following changes:
  • Debian-related changes:
    • Temporarily add Debian unstable deb-src lines to enable test builds a Non-maintainer Upload (NMU) campaign targeting 708 sources without .buildinfo files found in Debian unstable, including 475 in bookworm. [ ][ ]
    • Correctly deal with the Debian Edu packages not being installable. [ ]
    • Finally, stop scheduling stretch. [ ]
    • Make sure all Ubuntu nodes have the linux-image-generic kernel package installed. [ ]
  • Health checks & view:
    • Detect SSH login problems. [ ]
    • Only report the first uninstallable package set. [ ]
    • Show new bootstrap jobs. [ ] and debian-live jobs. [ ] in the job health view.
    • Fix regular expression to detect various zombie jobs. [ ]
  • New jobs:
    • Add a new job to test reproducibility of mmdebstrap bootstrapping tool. [ ][ ][ ][ ]
    • Run our new mmdebstrap job remotely [ ][ ]
    • Improve the output of the mmdebstrap job. [ ][ ][ ]
    • Adjust the mmdebstrap script to additionally support debootstrap as well. [ ][ ][ ]
    • Work around mmdebstrap and debootstrap keeping logfiles within their artifacts. [ ][ ][ ]
    • Add support for testing cdebootstrap too and add such a job for unstable. [ ][ ][ ]
    • Use a reproducible value for SOURCE_DATE_EPOCH for all our new bootstrap jobs. [ ]
  • Misc changes:
    • Send the create_meta_pkg_sets notification to #debian-reproducible-changes instead of #debian-reproducible. [ ]
In addition, Roland Clobus re-enabled the tests for live-build images [ ] and added a feature where the build would retry instead of give up when the archive was synced whilst building an ISO [ ], and Vagrant Cascadian added logging to report the current target of the /bin/sh symlink [ ].

Contact As ever, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

4 August 2022

Reproducible Builds: Reproducible Builds in July 2022

Welcome to the July 2022 report from the Reproducible Builds project! In our reports we attempt to outline the most relevant things that have been going on in the past month. As a brief introduction, the reproducible builds effort is concerned with ensuring no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Reproducible Builds summit 2022 Despite several delays, we are pleased to announce that registration is open for our in-person summit this year: November 1st November 3rd
The event will happen in Venice (Italy). We intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees. Please see the announcement email for information about how to register.

Is reproducibility practical? Ludovic Court s published an informative blog post this month asking the important question: Is reproducibility practical?:
Our attention was recently caught by a nice slide deck on the methods and tools for reproducible research in the R programming language. Among those, the talk mentions Guix, stating that it is for professional, sensitive applications that require ultimate reproducibility , which is probably a bit overkill for Reproducible Research . While we were flattered to see Guix suggested as good tool for reproducibility, the very notion that there s a kind of reproducibility that is ultimate and, essentially, impractical, is something that left us wondering: What kind of reproducibility do scientists need, if not the ultimate kind? Is reproducibility practical at all, or is it more of a horizon?
The post goes on to outlines the concept of reproducibility, situating examples within the context of the GNU Guix operating system.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 218, 219 and 220 to Debian, as well as made the following changes:
  • New features:
  • Bug fixes:
    • Fix a regression introduced in version 207 where diffoscope would crash if one directory contained a directory that wasn t in the other. Thanks to Alderico Gallo for the testcase. [ ]
    • Don t traceback if we encounter an invalid Unicode character in Haskell versioning headers. [ ]
  • Output improvements:
  • Codebase improvements:
    • Space out a file a little. [ ]
    • Update various copyright years. [ ]

Mailing list On our mailing list this month:
  • Roland Clobus posted his Eleventh status update about reproducible [Debian] live-build ISO images, noting amongst many other things! that all major desktops build reproducibly with bullseye, bookworm and sid.
  • Santiago Torres-Arias announced a Call for Papers (CfP) for a new SCORED conference, an academic workshop around software supply chain security . As Santiago highlights, this new conference invites reviewers from industry, open source, governement and academia to review the papers [and] I think that this is super important to tackle the supply chain security task .

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. This month, however, we submitted the following patches:

Reprotest reprotest is the Reproducible Builds project s end-user tool to build the same source code twice in widely and deliberate different environments, and checking whether the binaries produced by the builds have any differences. This month, the following changes were made:
  • Holger Levsen:
    • Uploaded version 0.7.21 to Debian unstable as well as mark 0.7.22 development in the repository [ ].
    • Make diffoscope dependency unversioned as the required version is met even in Debian buster. [ ]
    • Revert an accidentally committed hunk. [ ]
  • Mattia Rizzolo:
    • Apply a patch from Nick Rosbrook to not force the tests to run only against Python 3.9. [ ]
    • Run the tests through pybuild in order to run them against all supported Python 3.x versions. [ ]
    • Fix a deprecation warning in the setup.cfg file. [ ]
    • Close a new Debian bug. [ ]

Reproducible builds website A number of changes were made to the Reproducible Builds website and documentation this month, including:
  • Arnout Engelen:
  • Chris Lamb:
    • Correct some grammar. [ ]
  • Holger Levsen:
    • Add talk from FOSDEM 2015 presented by Holger and Lunar. [ ]
    • Show date of presentations if we have them. [ ][ ]
    • Add my presentation from DebConf22 [ ] and from Debian Reunion Hamburg 2022 [ ].
    • Add dhole to the speakers of the DebConf15 talk. [ ]
    • Add raboof s talk Reproducible Builds for Trustworthy Binaries from May Contain Hackers. [ ]
    • Drop some Debian-related suggested ideas which are not really relevant anymore. [ ]
    • Add a link to list of packages with patches ready to be NMUed. [ ]
  • Mattia Rizzolo:
    • Add information about our upcoming event in Venice. [ ][ ][ ][ ]

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, Holger Levsen made the following changes:
  • Debian-related changes:
    • Create graphs displaying existing .buildinfo files per each Debian suite/arch. [ ][ ]
    • Fix a typo in the Debian dashboard. [ ][ ]
    • Fix some issues in the pkg-r package set definition. [ ][ ][ ]
    • Improve the builtin-pho HTML output. [ ][ ][ ][ ]
    • Temporarily disable all live builds as our snapshot mirror is offline. [ ]
  • Automated node health checks:
    • Detect dpkg failures. [ ]
    • Detect files with bad UNIX permissions. [ ]
    • Relax a regular expression in order to detect Debian Live image build failures. [ ]
  • Misc changes:
    • Test that FreeBSD virtual machine has been updated to version 13.1. [ ]
    • Add a reminder about powercycling the armhf-architecture mst0X node. [ ]
    • Fix a number of typos. [ ][ ]
    • Update documentation. [ ][ ]
    • Fix Munin monitoring configuration for some nodes. [ ]
    • Fix the static IP address for a node. [ ]
In addition, Vagrant Cascadian updated host keys for the cbxi4pro0 and wbq0 nodes [ ] and, finally, node maintenance was also performed by Mattia Rizzolo [ ] and Holger Levsen [ ][ ][ ].

Contact As ever, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

13 July 2022

Reproducible Builds: Reproducible Builds in June 2022

Welcome to the June 2022 report from the Reproducible Builds project. In these reports, we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.

Save the date! Despite several delays, we are pleased to announce dates for our in-person summit this year: November 1st 2022 November 3rd 2022
The event will happen in/around Venice (Italy), and we intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees. Please see the announcement mail from Mattia Rizzolo, and do keep an eye on the mailing list for further announcements as it will hopefully include registration instructions.

News David Wheeler filed an issue against the Rust programming language to report that builds are not reproducible because full path to the source code is in the panic and debug strings . Luckily, as one of the responses mentions: the --remap-path-prefix solves this problem and has been used to great effect in build systems that rely on reproducibility (Bazel, Nix) to work at all and that there are efforts to teach cargo about it here .
The Python Security team announced that:
The ctx hosted project on PyPI was taken over via user account compromise and replaced with a malicious project which contained runtime code which collected the content of os.environ.items() when instantiating Ctx objects. The captured environment variables were sent as a base64 encoded query parameter to a Heroku application [ ]
As their announcement later goes onto state, version-pinning using hash-checking mode can prevent this attack, although this does depend on specific installations using this mode, rather than a prevention that can be applied systematically.
Developer vanitasvitae published an interesting and entertaining blog post detailing the blow-by-blow steps of debugging a reproducibility issue in PGPainless, a library which aims to make using OpenPGP in Java projects as simple as possible . Whilst their in-depth research into the internals of the .jar may have been unnecessary given that diffoscope would have identified the, it must be said that there is something to be said with occasionally delving into seemingly low-level details, as well describing any debugging process. Indeed, as vanitasvitae writes:
Yes, this would have spared me from 3h of debugging But I probably would also not have gone onto this little dive into the JAR/ZIP format, so in the end I m not mad.

Kees Cook published a short and practical blog post detailing how he uses reproducibility properties to aid work to replace one-element arrays in the Linux kernel. Kees approach is based on the principle that if a (small) proposed change is considered equivalent by the compiler, then the generated output will be identical but only if no other arbitrary or unrelated changes are introduced. Kees mentions the fantastic diffoscope tool, as well as various kernel-specific build options (eg. KBUILD_BUILD_TIMESTAMP) in order to prepare my build with the known to disrupt code layout options disabled .
Stefano Zacchiroli gave a presentation at GDR S curit Informatique based in part on a paper co-written with Chris Lamb titled Increasing the Integrity of Software Supply Chains. (Tweet)

Debian In Debian in this month, 28 reviews of Debian packages were added, 35 were updated and 27 were removed this month adding to our knowledge about identified issues. Two issue types were added: nondeterministic_checksum_generated_by_coq and nondetermistic_js_output_from_webpack. After Holger Levsen found hundreds of packages in the bookworm distribution that lack .buildinfo files, he uploaded 404 source packages to the archive (with no meaningful source changes). Currently bookworm now shows only 8 packages without .buildinfo files, and those 8 are fixed in unstable and should migrate shortly. By contrast, Debian unstable will always have packages without .buildinfo files, as this is how they come through the NEW queue. However, as these packages were not built on the official build servers (ie. they were uploaded by the maintainer) they will never migrate to Debian testing. In the future, therefore, testing should never have packages without .buildinfo files again. Roland Clobus posted yet another in-depth status report about his progress making the Debian Live images build reproducibly to our mailing list. In this update, Roland mentions that all major desktops build reproducibly with bullseye, bookworm and sid but also goes on to outline the progress made with automated testing of the generated images using openQA.

GNU Guix Vagrant Cascadian made a significant number of contributions to GNU Guix: Elsewhere in GNU Guix, Ludovic Court s published a paper in the journal The Art, Science, and Engineering of Programming called Building a Secure Software Supply Chain with GNU Guix:
This paper focuses on one research question: how can [Guix]((https://www.gnu.org/software/guix/) and similar systems allow users to securely update their software? [ ] Our main contribution is a model and tool to authenticate new Git revisions. We further show how, building on Git semantics, we build protections against downgrade attacks and related threats. We explain implementation choices. This work has been deployed in production two years ago, giving us insight on its actual use at scale every day. The Git checkout authentication at its core is applicable beyond the specific use case of Guix, and we think it could benefit to developer teams that use Git.
A full PDF of the text is available.

openSUSE In the world of openSUSE, SUSE announced at SUSECon that they are preparing to meet SLSA level 4. (SLSA (Supply chain Levels for Software Artifacts) is a new industry-led standardisation effort that aims to protect the integrity of the software supply chain.) However, at the time of writing, timestamps within RPM archives are not normalised, so bit-for-bit identical reproducible builds are not possible. Some in-toto provenance files published for SUSE s SLE-15-SP4 as one result of the SLSA level 4 effort. Old binaries are not rebuilt, so only new builds (e.g. maintenance updates) have this metadata added. Lastly, Bernhard M. Wiedemann posted his usual monthly openSUSE reproducible builds status report.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 215, 216 and 217 to Debian unstable. Chris Lamb also made the following changes:
  • New features:
    • Print profile output if we were called with --profile and we were killed via a TERM signal. This should help in situations where diffoscope is terminated due to some sort of timeout. [ ]
    • Support both PyPDF 1.x and 2.x. [ ]
  • Bug fixes:
    • Also catch IndexError exceptions (in addition to ValueError) when parsing .pyc files. (#1012258)
    • Correct the logic for supporting different versions of the argcomplete module. [ ]
  • Output improvements:
    • Don t leak the (likely-temporary) pathname when comparing PDF documents. [ ]
  • Logging improvements:
    • Update test fixtures for GNU readelf 2.38 (now in Debian unstable). [ ][ ]
    • Be more specific about the minimum required version of readelf (ie. binutils), as it appears that this patch level version change resulted in a change of output, not the minor version. [ ]
    • Use our @skip_unless_tool_is_at_least decorator (NB. at_least) over @skip_if_tool_version_is (NB. is) to fix tests under Debian stable. [ ]
    • Emit a warning if/when we are handling a UNIX TERM signal. [ ]
  • Codebase improvements:
    • Clarify in what situations the main finally block gets called with respect to TERM signal handling. [ ]
    • Clarify control flow in the diffoscope.profiling module. [ ]
    • Correctly package the scripts/ directory. [ ]
In addition, Edward Betts updated a broken link to the RSS on the diffoscope homepage and Vagrant Cascadian updated the diffoscope package in GNU Guix [ ][ ][ ].

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Add a package set for packages that use the R programming language [ ] as well as one for Rust [ ].
    • Improve package set matching for Python [ ] and font-related [ ] packages.
    • Install the lz4, lzop and xz-utils packages on all nodes in order to detect running kernels. [ ]
    • Improve the cleanup mechanisms when testing the reproducibility of Debian Live images. [ ][ ]
    • In the automated node health checks, deprioritise the generic kernel warning . [ ]
  • Roland Clobus (Debian Live image reproducibility):
    • Add various maintenance jobs to the Jenkins view. [ ]
    • Cleanup old workspaces after 24 hours. [ ]
    • Cleanup temporary workspace and resulting directories. [ ]
    • Implement a number of fixes and improvements around publishing files. [ ][ ][ ]
    • Don t attempt to preserve the file timestamps when copying artifacts. [ ]
And finally, node maintenance was also performed by Mattia Rizzolo [ ].

Mailing list and website On our mailing list this month: Lastly, Chris Lamb updated the main Reproducible Builds website and documentation in a number of small ways, but primarily published an interview with Hans-Christoph Steiner of the F-Droid project. Chris Lamb also added a Coffeescript example for parsing and using the SOURCE_DATE_EPOCH environment variable [ ]. In addition, Sebastian Crane very-helpfully updated the screenshot of salsa.debian.org s request access button on the How to join the Salsa group. [ ]

Contact If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

6 June 2022

Reproducible Builds: Reproducible Builds in May 2022

Welcome to the May 2022 report from the Reproducible Builds project. In our reports we outline the most important things that we have been up to over the past month. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Repfix paper Zhilei Ren, Shiwei Sun, Jifeng Xuan, Xiaochen Li, Zhide Zhou and He Jiang have published an academic paper titled Automated Patching for Unreproducible Builds:
[..] fixing unreproducible build issues poses a set of challenges [..], among which we consider the localization granularity and the historical knowledge utilization as the most significant ones. To tackle these challenges, we propose a novel approach [called] RepFix that combines tracing-based fine-grained localization with history-based patch generation mechanisms.
The paper (PDF, 3.5MB) uses the Debian mylvmbackup package as an example to show how RepFix can automatically generate patches to make software build reproducibly. As it happens, Reiner Herrmann submitted a patch for the mylvmbackup package which has remained unapplied by the Debian package maintainer for over seven years, thus this paper inadvertently underscores that achieving reproducible builds will require both technical and social solutions.

Python variables Johannes Schauer discovered a fascinating bug where simply naming your Python variable _m led to unreproducible .pyc files. In particular, the types module in Python 3.10 requires the following patch to make it reproducible:
--- a/Lib/types.py
+++ b/Lib/types.py
@@ -37,8 +37,8 @@ _ag = _ag()
 AsyncGeneratorType = type(_ag)
 
 class _C:
-    def _m(self): pass
-MethodType = type(_C()._m)
+    def _b(self): pass
+MethodType = type(_C()._b)
Simply renaming the dummy method from _m to _b was enough to workaround the problem. Johannes bug report first led to a number of improvements in diffoscope to aid in dissecting .pyc files, but upstream identified this as caused by an issue surrounding interned strings and is being tracked in CPython bug #78274.

New SPDX team to incorporate build metadata in Software Bill of Materials SPDX, the open standard for Software Bill of Materials (SBOM), is continuously developed by a number of teams and committees. However, SPDX has welcomed a new addition; a team dedicated to enhancing metadata about software builds, complementing reproducible builds in creating a more secure software supply chain. The SPDX Builds Team has been working throughout May to define the universal primitives shared by all build systems, including the who, what, where and how of builds:
  • Who: the identity of the person or organisation that controls the build infrastructure.
  • What: the inputs and outputs of a given build, combining metadata about the build s configuration with an SBOM describing source code and dependencies.
  • Where: the software packages making up the build system, from build orchestration tools such as Woodpecker CI and Tekton to language-specific tools.
  • How: the invocation of a build, linking metadata of a build to the identity of the person or automation tool that initiated it.
The SPDX Builds Team expects to have a usable data model by September, ready for inclusion in the SPDX 3.0 standard. The team welcomes new contributors, inviting those interested in joining to introduce themselves on the SPDX-Tech mailing list.

Talks at Debian Reunion Hamburg Some of the Reproducible Builds team (Holger Levsen, Mattia Rizzolo, Roland Clobus, Philip Rinn, etc.) met in real life at the Debian Reunion Hamburg (official homepage). There were several informal discussions amongst them, as well as two talks related to reproducible builds. First, Holger Levsen gave a talk on the status of Reproducible Builds for bullseye and bookworm and beyond (WebM, 210MB): Secondly, Roland Clobus gave a talk called Reproducible builds as applied to non-compiler output (WebM, 115MB):

Supply-chain security attacks This was another bumper month for supply-chain attacks in package repositories. Early in the month, Lance R. Vick noticed that the maintainer of the NPM foreach package let their personal email domain expire, so they bought it and now controls foreach on NPM and the 36,826 projects that depend on it . Shortly afterwards, Drew DeVault published a related blog post titled When will we learn? that offers a brief timeline of major incidents in this area and, not uncontroversially, suggests that the correct way to ship packages is with your distribution s package manager .

Bootstrapping Bootstrapping is a process for building software tools progressively from a primitive compiler tool and source language up to a full Linux development environment with GCC, etc. This is important given the amount of trust we put in existing compiler binaries. This month, a bootstrappable mini-kernel was announced. Called boot2now, it comprises a series of compilers in the form of bootable machine images.

Google s new Assured Open Source Software service Google Cloud (the division responsible for the Google Compute Engine) announced a new Assured Open Source Software service. Noting the considerable 650% year-over-year increase in cyberattacks aimed at open source suppliers, the new service claims to enable enterprise and public sector users of open source software to easily incorporate the same OSS packages that Google uses into their own developer workflows . The announcement goes on to enumerate that packages curated by the new service would be:
  • Regularly scanned, analyzed, and fuzz-tested for vulnerabilities.
  • Have corresponding enriched metadata incorporating Container/Artifact Analysis data.
  • Are built with Cloud Build including evidence of verifiable SLSA-compliance
  • Are verifiably signed by Google.
  • Are distributed from an Artifact Registry secured and protected by Google.
(Full announcement)

A retrospective on the Rust programming language Andrew bunnie Huang published a long blog post this month promising a critical retrospective on the Rust programming language. Amongst many acute observations about the evolution of the language s syntax (etc.), the post beings to critique the languages approach to supply chain security ( Rust Has A Limited View of Supply Chain Security ) and reproducibility ( You Can t Reproduce Someone Else s Rust Build ):
There s some bugs open with the Rust maintainers to address reproducible builds, but with the number of issues they have to deal with in the language, I am not optimistic that this problem will be resolved anytime soon. Assuming the only driver of the unreproducibility is the inclusion of OS paths in the binary, one fix to this would be to re-configure our build system to run in some sort of a chroot environment or a virtual machine that fixes the paths in a way that almost anyone else could reproduce. I say almost anyone else because this fix would be OS-dependent, so we d be able to get reproducible builds under, for example, Linux, but it would not help Windows users where chroot environments are not a thing.
(Full post)

Reproducible Builds IRC meeting The minutes and logs from our May 2022 IRC meeting have been published. In case you missed this one, our next IRC meeting will take place on Tuesday 28th June at 15:00 UTC on #reproducible-builds on the OFTC network.

A new tool to improve supply-chain security in Arch Linux kpcyrd published yet another interesting tool related to reproducibility. Writing about the tool in a recent blog post, kpcyrd mentions that although many PKGBUILDs provide authentication in the context of signed Git tags (i.e. the ability to verify the Git tag was signed by one of the two trusted keys ), they do not support pinning, ie. that upstream could create a new signed Git tag with an identical name, and arbitrarily change the source code without the [maintainer] noticing . Conversely, other PKGBUILDs support pinning but not authentication. The new tool, auth-tarball-from-git, fixes both problems, as nearly outlined in kpcyrd s original blog post.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 212, 213 and 214 to Debian unstable. Chris also made the following changes:
  • New features:
    • Add support for extracting vmlinuz Linux kernel images. [ ]
    • Support both python-argcomplete 1.x and 2.x. [ ]
    • Strip sticky etc. from x.deb: sticky Debian binary package [ ]. [ ]
    • Integrate test coverage with GitLab s concept of artifacts. [ ][ ][ ]
  • Bug fixes:
    • Don t mask differences in .zip or .jar central directory extra fields. [ ]
    • Don t show a binary comparison of .zip or .jar files if we have observed at least one nested difference. [ ]
  • Codebase improvements:
    • Substantially update comment for our calls to zipinfo and zipinfo -v. [ ]
    • Use assert_diff in test_zip over calling get_data with a separate assert. [ ]
    • Don t call re.compile and then call .sub on the result; just call re.sub directly. [ ]
    • Clarify the comment around the difference between --usage and --help. [ ]
  • Testsuite improvements:
    • Test --help and --usage. [ ]
    • Test that --help includes the file formats. [ ]
Vagrant Cascadian added an external tool reference xb-tool for GNU Guix [ ] as well as updated the diffoscope package in GNU Guix itself [ ][ ][ ].

Distribution work In Debian, 41 reviews of Debian packages were added, 85 were updated and 13 were removed this month adding to our knowledge about identified issues. A number of issue types have been updated, including adding a new nondeterministic_ordering_in_deprecated_items_collected_by_doxygen toolchain issue [ ] as well as ones for mono_mastersummary_xml_files_inherit_filesystem_ordering [ ], extended_attributes_in_jar_file_created_without_manifest [ ] and apxs_captures_build_path [ ]. Vagrant Cascadian performed a rough check of the reproducibility of core package sets in GNU Guix, and in openSUSE, Bernhard M. Wiedemann posted his usual monthly reproducible builds status report.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducible builds website Chris Lamb updated the main Reproducible Builds website and documentation in a number of small ways, but also prepared and published an interview with Jan Nieuwenhuizen about Bootstrappable Builds, GNU Mes and GNU Guix. [ ][ ][ ][ ] In addition, Tim Jones added a link to the Talos Linux project [ ] and billchenchina fixed a dead link [ ].

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Add support for detecting running kernels that require attention. [ ]
    • Temporarily configure a host to support performing Debian builds for packages that lack .buildinfo files. [ ]
    • Update generated webpages to clarify wishes for feedback. [ ]
    • Update copyright years on various scripts. [ ]
  • Mattia Rizzolo:
    • Provide a facility so that Debian Live image generation can copy a file remotely. [ ][ ][ ][ ]
  • Roland Clobus:
    • Add initial support for testing generated images with OpenQA. [ ]
And finally, as usual, node maintenance was also performed by Holger Levsen [ ][ ].

Misc news On our mailing list this month:

Contact If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 May 2022

Reproducible Builds: Reproducible Builds in April 2022

Welcome to the April 2022 report from the Reproducible Builds project! In these reports, we try to summarise the most important things that we have been up to over the past month. If you are interested in contributing to the project, please take a few moments to visit our Contribute page on our website.

News Cory Doctorow published an interesting article this month about the possibility of Undetectable backdoors for machine learning models. Given that machine learning models can provide unpredictably incorrect results, Doctorow recounts that there exists another category of adversarial examples that comprise a gimmicked machine-learning input that, to the human eye, seems totally normal but which causes the ML system to misfire dramatically that permit the possibility of planting undetectable back doors into any machine learning system at training time .
Chris Lamb published two supporter spotlights on our blog: the first about Amateur Radio Digital Communications (ARDC) and the second about the Google Open Source Security Team (GOSST).
Piergiorgio Ladisa, Henrik Plate, Matias Martinez and Olivier Barais published a new academic paper titled A Taxonomy of Attacks on Open-Source Software Supply Chains (PDF):
This work proposes a general taxonomy for attacks on open-source supply chains, independent of specific programming languages or ecosystems, and covering all supply chain stages from code contributions to package distribution. Taking the form of an attack tree, it covers 107 unique vectors, linked to 94 real-world incidents, and mapped to 33 mitigating safeguards.

Elsewhere in academia, Ly Vu Duc published his PhD thesis. Titled Towards Understanding and Securing the OSS Supply Chain (PDF), Duc s abstract reads as follows:
This dissertation starts from the first link in the software supply chain, developers . Since many developers do not update their vulnerable software libraries, thus exposing the user of their code to security risks. To understand how they choose, manage and update the libraries, packages, and other Open-Source Software (OSS) that become the building blocks of companies completed products consumed by end-users, twenty-five semi-structured interviews were conducted with developers of both large and small-medium enterprises in nine countries. All interviews were transcribed, coded, and analyzed according to applied thematic analysis

Upstream news Filippo Valsorda published an informative blog post recently called How Go Mitigates Supply Chain Attacks outlining the high-level features of the Go ecosystem that helps prevent various supply-chain attacks.
There was new/further activity on a pull request filed against openssl by Sebastian Andrzej Siewior in order to prevent saved CFLAGS (which may contain the -fdebug-prefix-map=<PATH> flag that is used to strip an arbitrary the build path from the debug info if this information remains recorded then the binary is no longer reproducible if the build directory changes.

Events The Linux Foundation s SupplyChainSecurityCon, will take place June 21st 24th 2022, both virtually and in Austin, Texas. Long-time Reproducible Builds and openSUSE contributor Bernhard M. Wiedemann learned that he had his talk accepted, and will speak on Reproducible Builds: Unexpected Benefits and Problems on June 21st.
There will be an in-person Debian Reunion in Hamburg, Germany later this year, taking place from 23 30 May. Although this is a Debian event, there will be some folks from the broader Reproducible Builds community and, of course, everyone is welcome. Please see the event page on the Debian wiki for more information. 41 people have registered so far, and there s approx 10 on-site beds still left.
The minutes and logs from our April 2022 IRC meeting have been published. In case you missed this one, our next IRC meeting will take place on May 31st at 15:00 UTC on #reproducible-builds on the OFTC network.

Debian Roland Clobus wrote another in-depth status update about the status of live Debian images, summarising the current situation that all major desktops build reproducibly with bullseye, bookworm and sid, including the Cinnamon desktop on bookworm and sid, but at a small functionality cost: 14 words will be incorrectly abbreviated . This work incorporated:
  • Reporting an issue about unnecessarily modified timestamps in the daily Debian installer images. [ ]
  • Reporting a bug against the debian-installer: in order to use a suitable kernel version. (#1006800)
  • Reporting a bug in: texlive-binaries regarding the unreproducible content of .fmt files. (#1009196)
  • Adding hacks to make the Cinnamon desktop image reproducible in bookworm and sid. [ ]
  • Added a script to rebuild a live-build ISO image from a given timestamp. [
  • etc.
On our mailing list, Venkata Pyla started a thread on the Debian debconf cache is non-reproducible issue while creating system images and Vagrant Cascadian posted an excellent summary of the reproducibility status of core package sets in Debian and solicited for similar information from other distributions.
Lastly, 122 reviews of Debian packages were added, 44 were updated and 193 were removed this month adding to our extensive knowledge about identified issues. A number of issue types have been updated as well, including timestamps_generated_by_hevea, randomness_in_ocaml_preprocessed_files, build_path_captured_in_emacs_el_file, golang_compiler_captures_build_path_in_binary and build_path_captured_in_assembly_objects,

Other distributions Happy birthday to GNU Guix, which recently turned 10 years old! People have been sharing their stories, in which reproducible builds and bootstrappable builds are a recurring theme as a feature important to its users and developers. The experiences are available on the GNU Guix blog as well as a post on fossandcrafts.org
In openSUSE, Bernhard M. Wiedemann posted his usual monthly reproducible builds status report.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 210 and 211 to Debian unstable, as well as noticed that some Python .pyc files are reported as data, so we should support .pyc as a fallback filename extension [ ]. In addition, Mattia Rizzolo disabled the Gnumeric tests in Debian as the package is not currently available [ ] and dropped mplayer from Build-Depends too [ ]. In addition, Mattia fixed an issue to ensure that the PATH environment variable is properly modified for all actions, not just when running the comparator. [ ]

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Daniel Golle:
    • Prefer a different solution to avoid building all OpenWrt packages; skip packages from optional community feeds. [ ]
  • Holger Levsen:
    • Detect Python deprecation warnings in the node health check. [ ]
    • Detect failure to build the Debian Installer. [ ]
  • Mattia Rizzolo:
    • Install disorderfs for building OpenWrt packages. [ ]
  • Paul Spooren (OpenWrt-related changes):
    • Don t build all packages whilst the core packages are not yet reproducible. [ ]
    • Add a missing RUN directive to node_cleanup. [ ]
    • Be less verbose during a toolchain build. [ ]
    • Use disorderfs for rebuilds and update the documentation to match. [ ][ ][ ]
  • Roland Clobus:
    • Publish the last reproducible Debian ISO image. [ ]
    • Use the rebuild.sh script from the live-build package. [ ]
Lastly, node maintenance was also performed by Holger Levsen [ ][ ].
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

28 April 2022

Reproducible Builds (diffoscope): diffoscope 211 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 211. This version includes the following changes:
[ Mattia Rizzolo ]
* Drop mplayer from the Build-Depends, it was add likely by accident and it's
  not needed.
* Disable gnumeric tests in Debian because it's not currently available.
You find out more by visiting the project homepage.

15 April 2022

Reproducible Builds (diffoscope): diffoscope 210 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 210. This version includes the following changes:
[ Mattia Rizzolo ]
* Make sure that PATH is properly mangled for all diffoscope actions, not
  just when running comparators.
You find out more by visiting the project homepage.

8 April 2022

Reproducible Builds: Reproducible Builds in March 2022

Welcome to the March 2022 report from the Reproducible Builds project! In our monthly reports we outline the most important things that we have been up to over the past month.
The in-toto project was accepted as an incubating project within the Cloud Native Computing Foundation (CNCF). in-toto is a framework that protects the software supply chain by collecting and verifying relevant data. It does so by enabling libraries to collect information about software supply chain actions and then allowing software users and/or project managers to publish policies about software supply chain practices that can be verified before deploying or installing software. CNCF foundations hosts a number of critical components of the global technology infrastructure under the auspices of the Linux Foundation. (View full announcement.)
Herv Boutemy posted to our mailing list with an announcement that the Java Reproducible Central has hit the milestone of 500 fully reproduced builds of upstream projects . Indeed, at the time of writing, according to the nightly rebuild results, 530 releases were found to be fully reproducible, with 100% reproducible artifacts.
GitBOM is relatively new project to enable build tools trace every source file that is incorporated into build artifacts. As an experiment and/or proof-of-concept, the GitBOM developers are rebuilding Debian to generate side-channel build metadata for versions of Debian that have already been released. This only works because Debian is (partial) reproducible, so one can be sure that that, if the case where build artifacts are identical, any metadata generated during these instrumented builds applies to the binaries that were built and released in the past. More information on their approach is available in README file in the bomsh repository.
Ludovic Courtes has published an academic paper discussing how the performance requirements of high-performance computing are not (as usually assumed) at odds with reproducible builds. The received wisdom is that vendor-specific libraries and platform-specific CPU extensions have resulted in a culture of local recompilation to ensure the best performance, rendering the property of reproducibility unobtainable or even meaningless. In his paper, Ludovic explains how Guix has:
[ ] implemented what we call package multi-versioning for C/C++ software that lacks function multi-versioning and run-time dispatch [ ]. It is another way to ensure that users do not have to trade reproducibility for performance. (full PDF)

Kit Martin posted to the FOSSA blog a post titled The Three Pillars of Reproducible Builds. Inspired by the shock of infiltrated or intentionally broken NPM packages, supply chain attacks, long-unnoticed backdoors , the post goes on to outline the high-level steps that lead to a reproducible build:
It is one thing to talk about reproducible builds and how they strengthen software supply chain security, but it s quite another to effectively configure a reproducible build. Concrete steps for specific languages are a far larger topic than can be covered in a single blog post, but today we ll be talking about some guiding principles when designing reproducible builds. [ ]
The article was discussed on Hacker News.
Finally, Bernhard M. Wiedemann noticed that the GNU Helloworld project varies depending on whether it is being built during a full moon! (Reddit announcement, openSUSE bug report)

Events There will be an in-person Debian Reunion in Hamburg, Germany later this year, taking place from 23 30 May. Although this is a Debian event, there will be some folks from the broader Reproducible Builds community and, of course, everyone is welcome. Please see the event page on the Debian wiki for more information. Bernhard M. Wiedemann posted to our mailing list about a meetup for Reproducible Builds folks at the openSUSE conference in Nuremberg, Germany. It was also recently announced that DebConf22 will take place this year as an in-person conference in Prizren, Kosovo. The pre-conference meeting (or Debcamp ) will take place from 10 16 July, and the main talks, workshops, etc. will take place from 17 24 July.

Misc news Holger Levsen updated the Reproducible Builds website to improve the documentation for the SOURCE_DATE_EPOCH environment variable, both by expanding parts of the existing text [ ][ ] as well as clarifying meaning by removing text in other places [ ]. In addition, Chris Lamb added a Twitter Card to our website s metadata too [ ][ ][ ]. On our mailing list this month:

Distribution work In Debian this month:
  • Johannes Schauer Marin Rodrigues posted to the debian-devel list mentioning that he exploited the property of reproducibility within Debian to demonstrate that automatically converting a large number of packages to a new internal source version did not change the resulting packages. The proposed change could therefore be applied without causing breakage:
So now we have 364 source packages for which we have a patch and for which we can show that this patch does not change the build output. Do you agree that with those two properties, the advantages of the 3.0 (quilt) format are sufficient such that the change shall be implemented at least for those 364? [ ]
In openSUSE, Bernhard M. Wiedemann posted his usual monthly reproducible builds status report.

Tooling diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 207, 208 and 209 to Debian unstable, as well as made the following changes to the code itself:
  • Update minimum version of Black to prevent test failure on Ubuntu jammy. [ ]
  • Updated the R test fixture for the 4.2.x series of the R programming language. [ ]
Brent Spillner also worked on adding graceful handling for UNIX sockets and named pipes to diffoscope. [ ][ ][ ]. Vagrant Cascadian also updated the diffoscope package in GNU Guix. [ ][ ] reprotest is the Reproducible Build s project end-user tool to build the same source code twice in widely different environments and checking whether the binaries produced by the builds have any differences. This month, Santiago Ruano Rinc n added a new --append-build-command option [ ], which was subsequently uploaded to Debian unstable by Holger Levsen.

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Replace a local copy of the dsa-check-running-kernel script with a packaged version. [ ]
    • Don t hide the status of offline hosts in the Jenkins shell monitor. [ ]
    • Detect undefined service problems in the node health check. [ ]
    • Update the sources.lst file for our mail server as its still running Debian buster. [ ]
    • Add our mail server to our node inventory so it is included in the Jenkins maintenance processes. [ ]
    • Remove the debsecan package everywhere; it got installed accidentally via the Recommends relation. [ ]
    • Document the usage of the osuosl174 host. [ ]
Regular node maintenance was also performed by Holger Levsen [ ], Vagrant Cascadian [ ][ ][ ] and Mattia Rizzolo.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

5 April 2022

Kees Cook: security things in Linux v5.10

Previously: v5.9 Linux v5.10 was released in December, 2020. Here s my summary of various security things that I found interesting: AMD SEV-ES
While guest VM memory encryption with AMD SEV has been supported for a while, Joerg Roedel, Thomas Lendacky, and others added register state encryption (SEV-ES). This means it s even harder for a VM host to reconstruct a guest VM s state. x86 static calls
Josh Poimboeuf and Peter Zijlstra implemented static calls for x86, which operates very similarly to the static branch infrastructure in the kernel. With static branches, an if/else choice can be hard-coded, instead of being run-time evaluated every time. Such branches can be updated too (the kernel just rewrites the code to switch around the branch ). All these principles apply to static calls as well, but they re for replacing indirect function calls (i.e. a call through a function pointer) with a direct call (i.e. a hard-coded call address). This eliminates the need for Spectre mitigations (e.g. RETPOLINE) for these indirect calls, and avoids a memory lookup for the pointer. For hot-path code (like the scheduler), this has a measurable performance impact. It also serves as a kind of Control Flow Integrity implementation: an indirect call got removed, and the potential destinations have been explicitly identified at compile-time. network RNG improvements
In an effort to improve the pseudo-random number generator used by the network subsystem (for things like port numbers and packet sequence numbers), Linux s home-grown pRNG has been replaced by the SipHash round function, and perturbed by (hopefully) hard-to-predict internal kernel states. This should make it very hard to brute force the internal state of the pRNG and make predictions about future random numbers just from examining network traffic. Similarly, ICMP s global rate limiter was adjusted to avoid leaking details of network state, as a start to fixing recent DNS Cache Poisoning attacks. SafeSetID handles GID
Thomas Cedeno improved the SafeSetID LSM to handle group IDs (which required teaching the kernel about which syscalls were actually performing setgid.) Like the earlier setuid policy, this lets the system owner define an explicit list of allowed group ID transitions under CAP_SETGID (instead of to just any group), providing a way to keep the power of granting this capability much more limited. (This isn t complete yet, though, since handling setgroups() is still needed.) improve kernel s internal checking of file contents
The kernel provides LSMs (like the Integrity subsystem) with details about files as they re loaded. (For example, loading modules, new kernel images for kexec, and firmware.) There wasn t very good coverage for cases where the contents were coming from things that weren t files. To deal with this, new hooks were added that allow the LSMs to introspect the contents directly, and to do partial reads. This will give the LSMs much finer grain visibility into these kinds of operations. set_fs removal continues
With the earlier work landed to free the core kernel code from set_fs(), Christoph Hellwig made it possible for set_fs() to be optional for an architecture. Subsequently, he then removed set_fs() entirely for x86, riscv, and powerpc. These architectures will now be free from the entire class of kernel address limit attacks that only needed to corrupt a single value in struct thead_info. sysfs_emit() replaces sprintf() in /sys
Joe Perches tackled one of the most common bug classes with sprintf() and snprintf() in /sys handlers by creating a new helper, sysfs_emit(). This will handle the cases where kernel code was not correctly dealing with the length results from sprintf() calls, which might lead to buffer overflows in the PAGE_SIZE buffer that /sys handlers operate on. With the helper in place, it was possible to start the refactoring of the many sprintf() callers. nosymfollow mount option
Mattias Nissler and Ross Zwisler implemented the nosymfollow mount option. This entirely disables symlink resolution for the given filesystem, similar to other mount options where noexec disallows execve(), nosuid disallows setid bits, and nodev disallows device files. Quoting the patch, it is useful as a defensive measure for systems that need to deal with untrusted file systems in privileged contexts. (i.e. for when /proc/sys/fs/protected_symlinks isn t a big enough hammer.) Chrome OS uses this option for its stateful filesystem, as symlink traversal as been a common attack-persistence vector. ARMv8.5 Memory Tagging Extension support
Vincenzo Frascino added support to arm64 for the coming Memory Tagging Extension, which will be available for ARMv8.5 and later chips. It provides 4 bits of tags (covering multiples of 16 byte spans of the address space). This is enough to deterministically eliminate all linear heap buffer overflow flaws (1 tag for free , and then rotate even values and odd values for neighboring allocations), which is probably one of the most common bugs being currently exploited. It also makes use-after-free and over/under indexing much more difficult for attackers (but still possible if the target s tag bits can be exposed). Maybe some day we can switch to 128 bit virtual memory addresses and have fully versioned allocations. But for now, 16 tag values is better than none, though we do still need to wait for anyone to actually be shipping ARMv8.5 hardware. fixes for flaws found by UBSAN
The work to make UBSAN generally usable under syzkaller continues to bear fruit, with various fixes all over the kernel for stuff like shift-out-of-bounds, divide-by-zero, and integer overflow. Seeing these kinds of patches land reinforces the the rationale of shifting the burden of these kinds of checks to the toolchain: these run-time bugs continue to pop up. flexible array conversions
The work on flexible array conversions continues. Gustavo A. R. Silva and others continued to grind on the conversions, getting the kernel ever closer to being able to enable the -Warray-bounds compiler flag and clear the path for saner bounds checking of array indexes and memcpy() usage. That s it for now! Please let me know if you think anything else needs some attention. Next up is Linux v5.11.

2022, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.
CC BY-SA 4.0

5 March 2022

Reproducible Builds: Reproducible Builds in February 2022

Welcome to the February 2022 report from the Reproducible Builds project. In these reports, we try to round-up the important things we and others have been up to over the past month. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.
Jiawen Xiong, Yong Shi, Boyuan Chen, Filipe R. Cogo and Zhen Ming Jiang have published a new paper titled Towards Build Verifiability for Java-based Systems (PDF). The abstract of the paper contains the following:
Various efforts towards build verifiability have been made to C/C++-based systems, yet the techniques for Java-based systems are not systematic and are often specific to a particular build tool (eg. Maven). In this study, we present a systematic approach towards build verifiability on Java-based systems.

GitBOM is a flexible scheme to track the source code used to generate build artifacts via Git-like unique identifiers. Although the project has been active for a while, the community around GitBOM has now started running weekly community meetings.
The paper Chris Lamb and Stefano Zacchiroli is now available in the March/April 2022 issue of IEEE Software. Titled Reproducible Builds: Increasing the Integrity of Software Supply Chains (PDF), the abstract of the paper contains the following:
We first define the problem, and then provide insight into the challenges of making real-world software build in a reproducible manner-this is, when every build generates bit-for-bit identical results. Through the experience of the Reproducible Builds project making the Debian Linux distribution reproducible, we also describe the affinity between reproducibility and quality assurance (QA).

In openSUSE, Bernhard M. Wiedemann posted his monthly reproducible builds status report.
On our mailing list this month, Thomas Schmitt started a thread around the SOURCE_DATE_EPOCH specification related to formats that cannot help embedding potentially timezone-specific timestamp. (Full thread index.)
The Yocto Project is pleased to report that it s core metadata (OpenEmbedded-Core) is now reproducible for all recipes (100% coverage) after issues with newer languages such as Golang were resolved. This was announced in their recent Year in Review publication. It is of particular interest for security updates so that systems can have specific components updated but reducing the risk of other unintended changes and making the sections of the system changing very clear for audit. The project is now also making heavy use of equivalence of build output to determine whether further items in builds need to be rebuilt or whether cached previously built items can be used. As mentioned in the article above, there are now public servers sharing this equivalence information. Reproducibility is key in making this possible and effective to reduce build times/costs/resource usage.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 203, 204, 205 and 206 to Debian unstable, as well as made the following changes to the code itself:
  • Bug fixes:
    • Fix a file(1)-related regression where Debian .changes files that contained non-ASCII text were not identified as such, therefore resulting in seemingly arbitrary packages not actually comparing the nested files themselves. The non-ASCII parts were typically in the Maintainer or in the changelog text. [ ][ ]
    • Fix a regression when comparing directories against non-directories. [ ][ ]
    • If we fail to scan using binwalk, return False from BinwalkFile.recognizes. [ ]
    • If we fail to import binwalk, don t report that we are missing the Python rpm module! [ ]
  • Testsuite improvements:
    • Add a test for recent file(1) issue regarding .changes files. [ ]
    • Use our assert_diff utility where we can within the test_directory.py set of tests. [ ]
    • Don t run our binwalk-related tests as root or fakeroot. The latest version of binwalk has some new security protection against this. [ ]
  • Codebase improvements:
    • Drop the _PATH suffix from module-level globals that are not paths. [ ]
    • Tidy some control flow in Difference._reverse_self. [ ]
    • Don t print a warning to the console regarding NT_GNU_BUILD_ID changes. [ ]
In addition, Mattia Rizzolo updated the Debian packaging to ensure that diffoscope and diffoscope-minimal packages have the same version. [ ]

Website updates There were quite a few changes to the Reproducible Builds website and documentation this month as well, including:
  • Chris Lamb:
    • Considerably rework the Who is involved? page. [ ][ ]
    • Move the contributors.sh Bash/shell script into a Python script. [ ][ ][ ]
  • Daniel Shahaf:
    • Try a different Markdown footnote content syntax to work around a rendering issue. [ ][ ][ ]
  • Holger Levsen:
    • Make a huge number of changes to the Who is involved? page, including pre-populating a large number of contributors who cannot be identified from the metadata of the website itself. [ ][ ][ ][ ][ ]
    • Improve linking to sponsors in sidebar navigation. [ ]
    • drop sponsors paragraph as the navigation is clearer now. [ ]
    • Add Mullvad VPN as a bronze-level sponsor . [ ][ ]
  • Vagrant Cascadian:

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. February s patches included the following:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Daniel Golle:
    • Update the OpenWrt configuration to not depend on the host LLVM, adding lines to the .config seed to build LLVM for eBPF from source. [ ]
    • Preserve more OpenWrt-related build artifacts. [ ]
  • Holger Levsen:
  • Temporary use a different Git tree when building OpenWrt as our tests had been broken since September 2020. This was reverted after the patch in question was accepted by Paul Spooren into the canonical openwrt.git repository the next day.
    • Various improvements to debugging OpenWrt reproducibility. [ ][ ][ ][ ][ ]
    • Ignore useradd warnings when building packages. [ ]
    • Update the script to powercycle armhf architecture nodes to add a hint to where nodes named virt-*. [ ]
    • Update the node health check to also fix failed logrotate and man-db services. [ ]
  • Mattia Rizzolo:
    • Update the website job after contributors.sh script was rewritten in Python. [ ]
    • Make sure to set the DIFFOSCOPE environment variable when available. [ ]
  • Vagrant Cascadian:
    • Various updates to the diffoscope timeouts. [ ][ ][ ]
Node maintenance was also performed by Holger Levsen [ ] and Vagrant Cascadian [ ].

Finally If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

11 February 2022

Reproducible Builds (diffoscope): diffoscope 204 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 204. This version includes the following changes:
[ Chris Lamb ]
* Don't run the binwalk comparator tests as root (or fakeroot) as the
  latest version of binwalk has some security protection against doing
  precisely this.
* If we fail to scan a file using binwalk, return 'False' from
  BinwalkFile.recognizes rather than raise a traceback.
* If we fail to import the Python "binwalk" module, don't accidentally report
  that we are missing the "rpm" module instead.
[ Mattia Rizzolo ]
* Use dependencies to ensure that "diffoscope" and "diffoscope-minimal"
  packages always have the precise same version.
You find out more by visiting the project homepage.

6 February 2022

Jonathan McDowell: Free Software Activities for 2021

About a month later than I probably should have posted it, here s a recap of my Free Software activities in 2021. For previous years see 2019 + 2020. Again, this year had fewer contributions than I d like thanks to continuing fatigue about the state of the world, and trying to work on separation between work and leisure while working from home. I ve made some effort to improve that balance but it s still a work in progress.

Conferences No surprise, I didn t attend any in-person conferences in 2021. I find virtual conferences don t do a lot for me (a combination of my not carving time out for them in the same way, because not being at the conference means other things will inevitably intrude, and the lack of the social side) but I did get to attend a few of the DebConf21 talks, which was nice. I m hoping to make it to DebConf22 this year in person.

Debian Most of my contributions to Free software continue to happen within Debian. As part of the Data Protection Team I responded to various inbound queries to that team. Some of this involved chasing up other project teams who had been slow to respond - folks, if you re running a service that stores personal data about people then you need to be responsive to requests about it. Some of this was dealing with what look like automated scraping tools which send no information about the person making the request, and in all the cases we ve seen so far there s been no indication of any data about that person on any systems we have access to. Further team time was wasted dealing with the Princeton-Radboud Study on Privacy Law Implementation (though Matthew did the majority of the work on this). The Debian Keyring was possibly my largest single point of contribution. We re in a roughly 3 month rotation of who handles the keyring updates, and I handled 2021.03.24, 2021.04.09, 2021.06.25, 2021.09.25 + 2021.12.24 For Debian New Members I m mostly inactive as an application manager - we generally seem to have enough available recently. If that changes I ll look at stepping in to help, but I don t see that happening. I continue to be involved in Front Desk, having various conversations throughout the year with the rest of the team, but there s no doubt Mattia and Pierre-Elliott are the real doers at present. I did take part in an NM Committee appeals process. In terms of package uploads I continued to work on gcc-xtensa-lx106, largely doing uploads to deal with updates to the GCC version or packaging (8 + 9). sigrok had a few minor updates, libsigkrok 0.5.2-3, pulseview 0.4.2-3 as well as a new upstream release of sigrok CLI 0.7.2-1. There was a last minute pre-release upload of libserialport 0.1.1-4 thanks to a kernel change in v5.10.37 which removed termiox support. Despite still not writing any VHDL these days I continue to keep an eye on ghdl, because I found it a useful tool in the past. Last year that was just a build fix for LLVM 11.1.0 - 1.0.0+dfsg+5. Andreas Bombe has largely taken over more proactive maintenance, which is nice to see. I uploaded OpenOCD 0.11.0~rc1-2, cleaning up some packaging / dependency issues. This was followed by 0.11.0~rc2-1 as a newer release candidate. Sadly 0.11.0 did not make it in time for bullseye, but rc2 was fairly close and I uploaded 0.11.0-1 once bullseye was released. Finally I did a drive-by upload for garmin-forerunner-tools 0.10repacked-12, cleaning up some packaging issues and uploading it to salsa. My Forerunner 305 has died (after 11 years of sterling service) and the Forerunner 45 I ve replaced it with uses a different set of tools, so I decided it didn t make sense to pick up longer term ownership of the package.

Linux My Linux contributions continued to revolve around pushing MikroTik RB3011 support upstream. There was a minor code change to Set FIFO sizes for ipq806x (which fixed up the allowed MTU for the internal switch + VLANs). The rest was DTS related - adding ADM DMA + NAND definitions now that the ADM driver was merged, adding tsens details, adding USB port info and adding the L2CC and RPM details for IPQ8064. Finally I was able to update the RB3011 DTS to enable NAND + USB. With all those in I m down to 4 local patches against a mainline kernel, all of which are hacks that aren t suitable for submission upstream. 2 are for patching in details of the root device and ethernet MAC addresses, one is dealing with the fact the IPQ8064 has some reserved memory that doesn t play well with AUTO_ZRELADDR (there keeps being efforts to add some support for this via devicetree, but unfortunately it gets shot down every time), and the final one is a hack to turn off the LCD backlight by treating it as an LED (actually supporting the LCD properly is on my TODO list).

Personal projects 2021 didn t see any releases of onak. It s not dead, just resting, but Sequoia PGP is probably where you should be looking for a modern OpenPGP implementation. I continued work on my Desk Viking project, which is an STM32F103 based debug tool inspired by the Bus Pirate. The main addtion was some CCLib support (forking it in the process to move to Python 3 and add some speed ups) to allow me to program my Zigbee dongles, but I also added some 1-Wire search logic and some support for Linux emulation mode with VCD output to allow for a faster development cycle. I really want to try and get OpenOCD JTAG mode supported at some point, and have vague plans for an STM32F4 based version that have suffered from a combination of a silicon shortage and a lack of time. That wraps up 2021. I d like to say I m hoping to make more Free Software contributions this year, but I don t have a concrete plan yet for how that might happen, so I ll have to wait and see.

5 February 2022

Reproducible Builds: Reproducible Builds in January 2022

Welcome to the January 2022 report from the Reproducible Builds project. In our reports, we try outline the most important things that have been happening in the past month. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.
An interesting blog post was published by Paragon Initiative Enterprises about Gossamer, a proposal for securing the PHP software supply-chain. Utilising code-signing and third-party attestations, Gossamer aims to mitigate the risks within the notorious PHP world via publishing attestations to a transparency log. Their post, titled Solving Open Source Supply Chain Security for the PHP Ecosystem goes into some detail regarding the design, scope and implementation of the system.
This month, the Linux Foundation announced SupplyChainSecurityCon, a conference focused on exploring the security threats affecting the software supply chain, sharing best practices and mitigation tactics. The conference is part of the Linux Foundation s Open Source Summit North America and will take place June 21st 24th 2022, both virtually and in Austin, Texas.

Debian There was a significant progress made in the Debian Linux distribution this month, including:

Other distributions kpcyrd reported on Twitter about the release of version 0.2.0 of pacman-bintrans, an experiment with binary transparency for the Arch Linux package manager, pacman. This new version is now able to query rebuilderd to check if a package was independently reproduced.
In the world of openSUSE, however, Bernhard M. Wiedemann posted his monthly reproducible builds status report.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 199, 200, 201 and 202 to Debian unstable (that were later backported to Debian bullseye-backports by Mattia Rizzolo), as well as made the following changes to the code itself:
  • New features:
    • First attempt at incremental output support with a timeout. Now passing, for example, --timeout=60 will mean that diffoscope will not recurse into any sub-archives after 60 seconds total execution time has elapsed. Note that this is not a fixed/strict timeout due to implementation issues. [ ][ ]
    • Support both variants of odt2txt, including the one provided by the unoconv package. [ ]
  • Bug fixes:
    • Do not return with a UNIX exit code of 0 if we encounter with a file whose human-readable metadata matches literal file contents. [ ]
    • Don t fail if comparing a nonexistent file with a .pyc file (and add test). [ ][ ]
    • If the debian.deb822 module raises any exception on import, re-raise it as an ImportError. This should fix diffoscope on some Fedora systems. [ ]
    • Even if a Sphinx .inv inventory file is labelled The remainder of this file is compressed using zlib, it might not actually be. In this case, don t traceback and simply return the original content. [ ]
  • Documentation:
    • Improve documentation for the new --timeout option due to a few misconceptions. [ ]
    • Drop reference in the manual page claiming the ability to compare non-existent files on the command-line. (This has not been possible since version 32 which was released in September 2015). [ ]
    • Update X has been modified after NT_GNU_BUILD_ID has been applied messages to, for example, not duplicating the full filename in the diffoscope output. [ ]
  • Codebase improvements:
    • Tidy some control flow. [ ]
    • Correct a recompile typo. [ ]
In addition, Alyssa Ross fixed the comparison of CBFS names that contain spaces [ ], Sergei Trofimovich fixed whitespace for compatibility with version 21.12 of the Black source code reformatter [ ] and Zbigniew J drzejewski-Szmek fixed JSON detection with a new version of file [ ].

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Fr d ric Pierret (fepitre):
    • Add Debian bookworm to package set creation. [ ]
  • Holger Levsen:
    • Install the po4a package where appropriate, as it is needed for the Reproducible Builds website job [ ]. In addition, also run the i18n.sh and contributors.sh scripts [ ].
    • Correct some grammar in Debian live image build output. [ ]
    • Shell monitor improvements:
      • Only show the offline node section if there are offline nodes. [ ]
      • Colorise offline nodes. [ ]
      • Shrink screen usage. [ ][ ][ ]
    • Node health check improvements:
      • Detect if live package builds encounter incomplete snapshots. [ ][ ][ ]
      • Detect if a host is running with today s date (when it should be set artificially in the future). [ ]
    • Use the devscripts package from bullseye-backports on Debian nodes. [ ]
    • Use the Munin monitoring package bullseye-backports on Debian nodes too. [ ]
    • Update New Year handling, needed to be able to detect real and fake dates. [ ][ ]
    • Improve the error message of the script that powercycles the arm64 architecture nodes hosted by Codethink. [ ]
  • Mattia Rizzolo:
    • Use the new --timeout option added in diffoscope version 202. [ ]
  • Roland Clobus:
    • Update the build scripts now that the hooks for live builds are now maintained upstream in the live-build repository. [ ]
    • Show info lines in Jenkins when reproducible hooks have been active. [ ]
    • Use unique folders for the artifacts from each live Debian version. [ ]
  • Vagrant Cascadian:
    • Switch the Debian armhf architecture nodes to use new proxy. [ ]
    • Misc. node maintenance. [ ].

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. In January, we wrote a large number of such patches, including:

And finally If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

21 January 2022

Reproducible Builds (diffoscope): diffoscope 201 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 201. This version includes the following changes:
[ Chris Lamb ]
* If the debian.deb822 module raises any exception on import, re-raise it as
  an ImportError instead. This should fix diffoscope on some Fedora systems.
  Thanks to Mattia Rizzolo for suggesting this particular solution.
  (Closes: reproducible-builds/diffoscope#300)
[ Zbigniew J drzejewski-Szmek ]
* Fix json detection with file-5.41-3.fc36.x86_64.
You find out more by visiting the project homepage.

5 January 2022

Reproducible Builds: Reproducible Builds in December 2021

Welcome to the December 2021 report from the Reproducible Builds project! In these reports, we try and summarise what we have been up to over the past month, as well as what else has been occurring in the world of software supply-chain security. As a quick recap of what reproducible builds is trying to address, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. As always, if you would like to contribute to the project, please get in touch with us directly or visit the Contribute page on our website.
Early in December, Julien Voisin blogged about setting up a rebuilderd instance in order to reproduce Tails images. Working on previous work from 2018, Julien has now set up a public-facing instance which is providing build attestations. As Julien dryly notes in his post, Currently, this isn t really super-useful to anyone, except maybe some Tails developers who want to check that the release manager didn t backdoor the released image. Naturally, we would contend sincerely that this is indeed useful.
The secure/anonymous Tor browser now supports reproducible source releases. According to the project s changelog, version 0.4.7.3-alpha of Tor can now build reproducible tarballs via the make dist-reprod command. This issue was tracked via Tor issue #26299.
Fabian Keil posted a question to our mailing list this month asking how they might analyse differences in images produced with the FreeBSD and ElectroBSD s mkimg and makefs commands:
After rebasing ElectroBSD from FreeBSD stable/11 to stable/12
I recently noticed that the "memstick" images are unfortunately
still not 100% reproducible.
Fabian s original post generated a short back-and-forth with Chris Lamb regarding how diffoscope might be able to support the particular format of images generated by this command set.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploading versions 195, 196, 197 and 198 to Debian, as well as made the following changes:
  • Support showing Ordering differences only within .dsc field values. [ ]
  • Add support for XMLb files. [ ]
  • Also add, for example, /usr/lib/x86_64-linux-gnu to our local binary search path. [ ]
  • Support OCaml versions 4.11, 4.12 and 4.13. [ ]
  • Drop some unnecessary has_same_content_as logging calls. [ ]
  • Replace token variable with an anonymously-named variable instead to remove extra lines. [ ]
  • Don t use the runtime platform s native endianness when unpacking .pyc files. This fixes test failures on big-endian machines. [ ]
Mattia Rizzolo also made a number of changes to diffoscope this month as well, such as:
  • Also recognize GnuCash files as XML. [ ]
  • Support the pgpdump PGP packet visualiser version 0.34. [ ]
  • Ignore the new Lintian tag binary-with-bad-dynamic-table. [ ]
  • Fix the Enhances field in debian/control. [ ]
Finally, Brent Spillner fixed the version detection for Black uncompromising code formatter [ ], Jelle van der Waa added an external tool reference for Arch Linux [ ] and Roland Clobus added support for reporting when the GNU_BUILD_ID field has been modified [ ]. Thank you for your contributions!

Distribution work In Debian this month, 70 reviews of packages were added, 27 were updated and 41 were removed, adding to our database of knowledge about specific issues. A number of issue types were created as well, including: strip-nondeterminism version 1.13.0-1 was uploaded to Debian unstable by Holger Levsen. It included contributions already covered in previous months as well as new ones from Mattia Rizzolo, particularly that the dh_strip_nondeterminism Debian integration interface uses the new get_non_binnmu_date_epoch() utility when available: this is important to ensure that strip-nondeterminism does not break some kinds of binNMUs.
In the world of openSUSE, however, Bernhard M. Wiedemann posted his monthly reproducible builds status report.
In NixOS, work towards the longer-term goal of making the graphical installation image reproducible is ongoing. For example, Artturin made the gnome-desktop package reproducible.

Upstream patches The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. In December, we wrote a large number of such patches, including:

Testing framework The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Run the Debian scheduler less often. [ ]
    • Fix the name of the Debian testing suite name. [ ]
    • Detect builds that are rescheduling due to problems with the diffoscope container. [ ]
    • No longer special-case particular machines having a different /boot partition size. [ ]
    • Automatically fix failed apt-daily and apt-daily-upgrade services [ ], failed e2scrub_all.service & user@ systemd units [ ][ ] as well as generic build failures [ ].
    • Simplify a script to powercycle arm64 architecture nodes hosted at/by codethink.co.uk. [ ]
    • Detect if the udd-mirror.debian.net service is down. [ ]
    • Various miscellaneous node maintenance. [ ][ ]
  • Roland Clobus (Debian live image generation):
    • If the latest snapshot is not complete yet, try to use the previous snapshot instead. [ ]
    • Minor: whitespace correction + comment correction. [ ]
    • Use unique folders and reports for each Debian version. [ ]
    • Turn off debugging. [ ]
    • Add a better error description for incorrect/missing arguments. [ ]
    • Report non-reproducible issues in Debian sid images. [ ]
Lastly, Mattia Rizzolo updated the automatic logfile parsing rules in a number of ways (eg. to ignore a warning about the Python setuptools deprecation) [ ][ ] and Vagrant Cascadian adjusted the config for the Squid caching proxy on a node. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

31 December 2021

Reproducible Builds (diffoscope): diffoscope 198 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 198. This version includes the following changes:
[ Chris Lamb ]
* Support showing "Ordering differences only" within .dsc field values.
  (Closes: #1002002, reproducible-builds/diffoscope#297)
* Support OCaml versions 4.11, 4.12 and 4.13. (Closes: #1002678)
* Add support for XMLb files. (Closes: reproducible-builds/diffoscope#295)
* Also add, for example, /usr/lib/x86_64-linux-gnu to our internal PATH.
[ Mattia Rizzolo ]
* Also recognize "GnuCash file" files as XML.
You find out more by visiting the project homepage.

17 December 2021

Reproducible Builds (diffoscope): diffoscope 197 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 197. This version includes the following changes:
[ Chris Lamb ]
* Drop unnecessary has_same_content_as logging calls.
[ Mattia Rizzolo ]
* Ignore the new "binary-with-bad-dynamic-table" Lintian tag.
* Support pgpdump 0.34 in the tests. Thanks to Michael Weiss
  <dev.primeos@gmail.com> for reporting and testing the fix.
You find out more by visiting the project homepage.

5 December 2021

Reproducible Builds: Reproducible Builds in November 2021

Welcome to the November 2021 report from the Reproducible Builds project. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries. The motivation behind the reproducible builds effort is therefore to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. If you are interested in contributing to our project, please visit our Contribute page on our website.
On November 6th, Vagrant Cascadian presented at this year s edition of the SeaGL conference, giving a talk titled Debugging Reproducible Builds One Day at a Time:
I ll explore how I go about identifying issues to work on, learn more about the specific issues, recreate the problem locally, isolate the potential causes, dissect the problem into identifiable parts, and adapt the packaging and/or source code to fix the issues.
A video recording of the talk is available on archive.org.
Fedora Magazine published a post written by Zbigniew J drzejewski-Szmek about how to Use Diffoscope in packager workflows, specifically around ensuring that new versions of a package do not introduce breaking changes:
In the role of a packager, updating packages is a recurring task. For some projects, a packager is involved in upstream maintenance, or well written release notes make it easy to figure out what changed between the releases. This isn t always the case, for instance with some small project maintained by one or two people somewhere on GitHub, and it can be useful to verify what exactly changed. Diffoscope can help determine the changes between package releases. [ ]

kpcyrd announced the release of rebuilderd version 0.16.3 on our mailing list this month, adding support for builds to generate multiple artifacts at once.
Lastly, we held another IRC meeting on November 30th. As mentioned in previous reports, due to the global events throughout 2020 etc. there will be no in-person summit event this year.

diffoscope diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes, including preparing and uploading versions 190, 191, 192, 193 and 194 to Debian:
  • New features:
    • Continue loading a .changes file even if the referenced files do not exist, but include a comment in the returned diff. [ ]
    • Log the reason if we cannot load a Debian .changes file. [ ]
  • Bug fixes:
    • Detect XML files as XML files if file(1) claims if they are XML files or if they are named .xml. (#999438)
    • Don t duplicate file lists at each directory level. (#989192)
    • Don t raise a traceback when comparing nested directories with non-directories. [ ]
    • Re-enable test_android_manifest. [ ]
    • Don t reject Debian .changes files if they contain non-printable characters. [ ]
  • Codebase improvements:
    • Avoid aliasing variables if we aren t going to use them. [ ]
    • Use isinstance over type. [ ]
    • Drop a number of unused imports. [ ]
    • Update a bunch of %-style string interpolations into f-strings or str.format. [ ]
    • When pretty-printing JSON, mark the difference as being reformatted, additionally avoiding including the full path. [ ]
    • Import itertools top-level module directly. [ ]
Chris Lamb also made an update to the command-line client to trydiffoscope, a web-based version of the diffoscope in-depth and content-aware diff utility, specifically only waiting for 2 minutes for try.diffoscope.org to respond in tests. (#998360) In addition Brandon Maier corrected an issue where parts of large diffs were missing from the output [ ], Zbigniew J drzejewski-Szmek fixed some logic in the assert_diff_startswith method [ ] and Mattia Rizzolo updated the packaging metadata to denote that we support both Python 3.9 and 3.10 [ ] as well as a number of warning-related changes[ ][ ]. Vagrant Cascadian also updated the diffoscope package in GNU Guix [ ][ ].

Distribution work In Debian, Roland Clobus updated the wiki page documenting Debian reproducible Live images to mention some new bug reports and also posted an in-depth status update to our mailing list. In addition, 90 reviews of Debian packages were added, 18 were updated and 23 were removed this month adding to our knowledge about identified issues. Chris Lamb identified a new toolchain issue, absolute_path_in_cmake_file_generated_by_meson.
Work has begun on classifying reproducibility issues in packages within the Arch Linux distribution. Similar to the analogous effort within Debian (outlined above), package information is listed in a human-readable packages.yml YAML file and a sibling README.md file shows how to classify packages too. Finally, Bernhard M. Wiedemann posted his monthly reproducible builds status report for openSUSE and Vagrant Cascadian updated a link on our website to link to the GNU Guix reproducibility testing overview [ ].

Software development The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Elsewhere, in software development, Jonas Witschel updated strip-nondeterminism, our tool to remove specific non-deterministic results from a completed build so that it did not fail on JAR archives containing invalid members with a .jar extension [ ]. This change was later uploaded to Debian by Chris Lamb. reprotest is the Reproducible Build s project end-user tool to build the same source code twice in widely different environments and checking whether the binaries produced by the builds have any differences. This month, Mattia Rizzolo overhauled the Debian packaging [ ][ ][ ] and fixed a bug surrounding suffixes in the Debian package version [ ], whilst Stefano Rivera fixed an issue where the package tests were broken after the removal of diffoscope from the package s strict dependencies [ ].

Testing framework The Reproducible Builds project runs a testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
  • Holger Levsen:
    • Document the progress in setting up snapshot.reproducible-builds.org. [ ]
    • Add the packages required for debian-snapshot. [ ]
    • Make the dstat package available on all Debian based systems. [ ]
    • Mark virt32b-armhf and virt64b-armhf as down. [ ]
  • Jochen Sprickerhof:
    • Add SSH authentication key and enable access to the osuosl168-amd64 node. [ ][ ]
  • Mattia Rizzolo:
    • Revert reproducible Debian: mark virt(32 64)b-armhf as down - restored. [ ]
  • Roland Clobus (Debian live image generation):
    • Rename sid internally to unstable until an issue in the snapshot system is resolved. [ ]
    • Extend testing to include Debian bookworm too.. [ ]
    • Automatically create the Jenkins view to display jobs related to building the Live images. [ ]
  • Vagrant Cascadian:
    • Add a Debian package set group for the packages and tools maintained by the Reproducible Builds maintainers themselves. [ ]


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

19 November 2021

Reproducible Builds (diffoscope): diffoscope 193 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 193. This version includes the following changes:
[ Chris Lamb ]
* Don't duplicate file lists at each directory level.
  (Closes: #989192, reproducible-builds/diffoscope#263)
* When pretty-printing JSON, mark the difference as such, additionally
  avoiding including the full path.
  (Closes: reproducible-builds/diffoscope#205)
* Codebase improvements:
  - Update a bunch of %-style string interpolations into f-strings or
    str.format.
  - Import itertools top-level directly.
  - Drop some unused imports.
  - Use isinstance(...) over type(...) ==
  - Avoid aliasing variables if we aren't going to use them.
[ Brandon Maier ]
* Fix missing diff output on large diffs.
[ Mattia Rizzolo ]
* Ignore a Python warning coming from a dependent library (triggered by
  supporting Python 3.10)
* Document that support both Python 3.9 and 3.10.
You find out more by visiting the project homepage.

Next.

Previous.